AITopics | decision-focused learning

decision-focused learning

Learning MDPs from Features: Predict-Then-Optimize for Sequential Decision Problems by Reinforcement Learning

Neural Information Processing Systems, Feb-8-2026, 12:35:06 GMT

To resolve the first challenge, we propose to sample an estimate of the first-order and second-order derivatives to approximate the optimality and KKT conditions.

artificial intelligence, machine learning, reinforcement learning, (15 more...)

Neural Information Processing Systems

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.05)
North America > United States > Ohio > Franklin County > Columbus (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.96)

Learning MDPs from Features: Predict-Then-Optimize for Sequential Decision Making by Reinforcement Learning

Neural Information Processing Systems, Dec-24-2025, 02:02:30 GMT

In the predict-then-optimize framework, the objective is to train a predictive model, mapping from environment features to parameters of an optimization problem, which maximizes decision quality when the optimization is subsequently solved. Recent work on decision-focused learning shows that embedding the optimization problem in the training pipeline can improve decision quality and help generalize better to unseen tasks compared to relying on an intermediate loss function for evaluating prediction quality. We study the predict-then-optimize framework in the context of sequential decision problems (formulated as MDPs) that are solved via reinforcement learning. In particular, we are given environment features and a set of trajectories from training MDPs, which we use to train a predictive model that generalizes to unseen test MDPs without trajectories. Two significant computational challenges arise in applying decision-focused learning to MDPs: (i) large state and action spaces make it infeasible for existing techniques to differentiate through MDP problems, and (ii) the high-dimensional policy space, as parameterized by a neural network, makes differentiating through a policy expensive. We resolve the first challenge by sampling provably unbiased derivatives to approximate and differentiate through optimality conditions, and the second challenge by using a low-rank approximation to the high-dimensional sample-based derivatives. We implement both Bellman-based and policy gradient-based decision-focused learning on three different MDP problems with missing parameters, and show that decision-focused learning performs better in generalization to unseen tasks.

learning mdp, predict-then-optimize, sequential decision, (12 more...)

Neural Information Processing Systems

Technology:

Information Technology > Modeling & Simulation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.42)

Decision-Focused Learning without Decision-Making: Learning Locally Optimized Decision Losses

Neural Information Processing Systems, Dec-23-2025, 17:51:47 GMT

Decision-Focused Learning (DFL) is a paradigm for tailoring a predictive model to a downstream optimization task that uses its predictions, so that the model performs better on that specific task. The main technical challenge associated with DFL is that it requires being able to differentiate through the optimization problem, which is difficult due to discontinuous solutions and other challenges. Past work has largely gotten around this issue by handcrafting task-specific surrogates to the original optimization problem that provide informative gradients when differentiated through. However, the need to handcraft surrogates for each new task limits the usability of DFL. In addition, there are often no guarantees about the convexity of the resulting surrogates and, as a result, training a predictive model using them can lead to inferior local optima. In this paper, we do away with surrogates altogether and instead learn loss functions that capture task-specific information. To the best of our knowledge, ours is the first approach that entirely replaces the optimization component of decision-focused learning with a loss that is automatically learned. Our approach (a) only requires access to a black-box oracle that can solve the optimization problem and is thus generalizable, and (b) can be convex by construction and so can be easily optimized over. We evaluate our approach on three resource allocation problems from the literature and find that it outperforms learning without taking task structure into account in all three domains, and even outperforms hand-crafted surrogates from the literature.

decision-focused learning, decision-making, name change, (8 more...)

Neural Information Processing Systems

Technology:

Information Technology > Modeling & Simulation (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)

A Dual Perspective on Decision-Focused Learning: Scalable Training via Dual-Guided Surrogates

Rodriguez-Diaz, Paula, Bansak, Kirk, Paulson, Elisabeth

arXiv.org Artificial Intelligence, Nov-10-2025

Many real-world decisions are made under uncertainty by solving optimization problems using predicted quantities. This predict-then-optimize paradigm has motivated decision-focused learning, which trains models with awareness of how the optimizer uses predictions, improving the performance of downstream decisions. Despite its promise, scaling is challenging: state-of-the-art methods either differentiate through a solver or rely on task-specific surrogates, both of which require frequent and expensive calls to an optimizer, often a combinatorial one. In this paper, we leverage dual variables from the downstream problem to shape learning and introduce Dual-Guided Loss (DGL), a simple, scalable objective that preserves decision alignment while reducing solver dependence. We construct DGL specifically for combinatorial selection problems with natural one-of-many constraints, such as matching, knapsack, and shortest path. Our approach (a) decouples optimization from gradient updates by solving the downstream problem only periodically; (b) between refreshes, trains on dual-adjusted targets using simple differentiable surrogate losses; and (c) as refreshes become less frequent, drives training cost toward standard supervised learning while retaining strong decision alignment. We prove that DGL has asymptotically diminishing decision regret, analyze runtime complexity, and show on two problem classes that DGL matches or exceeds state-of-the-art DFL methods while using far fewer solver calls and substantially less training time. Code is available at https://github.com/

artificial intelligence, inductive learning, machine learning, (16 more...)

arXiv.org Artificial Intelligence

2511.04909

Country: North America > United States > California > Alameda County > Berkeley (0.04)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.34)

Appendix A Proofs

Neural Information Processing Systems, Aug-14-2025, 07:48:56 GMT

The first part of the proof follows the policy gradient theorem. This concludes the proof of Theorem 1. For Theorem 2, the second-order derivative formulations stated in Theorem 1 and Theorem 2 are both unbiased derivative estimates. The randomly initialized neural network uses ReLU layers as the nonlinearity, followed by a final linear layer. In the gridworld example, we train the optimal policy with a tabular value-iteration algorithm that learns the Q value of each state-action pair. The number of available actions is 5, and the number of available states is 5 × 5 = 25.
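The tabular value iteration mentioned in this excerpt can be sketched as follows. Only the sizes (5 actions, 5 × 5 = 25 states) come from the excerpt; the action set (four moves plus "stay"), the deterministic transitions, the goal cell, the reward of 1 for entering the goal, and the discount factor are all assumptions made for illustration, not the paper's setup.

```python
# Minimal sketch of tabular Q-value iteration on a 5x5 gridworld.
# ASSUMPTIONS (not from the source): action set, deterministic moves,
# goal cell, reward layout, and discount factor.
N = 5
ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1), (0, 0)]  # up, down, left, right, stay
GAMMA = 0.9
GOAL = (4, 4)


def step(state, move):
    """Deterministic transition: apply the move, or stay put at a wall."""
    row, col = state
    new_row, new_col = row + move[0], col + move[1]
    if 0 <= new_row < N and 0 <= new_col < N:
        return (new_row, new_col)
    return state


def reward(next_state):
    """Hypothetical reward: 1 whenever the agent lands on the goal cell."""
    return 1.0 if next_state == GOAL else 0.0


def value_iteration(tol=1e-8, max_iters=10_000):
    """Iterate the Bellman optimality backup on Q until it stops changing."""
    states = [(r, c) for r in range(N) for c in range(N)]
    q = {(s, a): 0.0 for s in states for a in range(len(ACTIONS))}
    for _ in range(max_iters):
        new_q = {}
        delta = 0.0
        for s in states:
            for a, move in enumerate(ACTIONS):
                s_next = step(s, move)
                best_next = max(q[(s_next, b)] for b in range(len(ACTIONS)))
                new_q[(s, a)] = reward(s_next) + GAMMA * best_next
                delta = max(delta, abs(new_q[(s, a)] - q[(s, a)]))
        q = new_q
        if delta < tol:
            break
    return states, q


def greedy_action(q, state):
    """Index of the action with the highest Q value in the given state."""
    return max(range(len(ACTIONS)), key=lambda a: q[(state, a)])


states, q = value_iteration()
print(len(states), "states,", len(q), "state-action pairs")
print("greedy action index at (0, 0):", greedy_action(q, (0, 0)))
```

With 25 states and 5 actions the Q table has 125 entries, so a full sweep is cheap and no function approximation is needed for this example.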

artificial intelligence, machine learning, trajectory, (19 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (0.98)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.50)

49e863b146f3b5470ee222ee84669b1c-Paper.pdf

Neural Information Processing Systems, Aug-14-2025, 07:48:45 GMT

machine learning, reinforcement learning, trajectory, (17 more...)

Neural Information Processing Systems

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.05)
North America > United States > Ohio > Franklin County > Columbus (0.04)
Asia > India (0.04)

Genre: Research Report (0.68)

Industry: Health & Medicine > Therapeutic Area (0.69)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)

Decision-Focused Learning with Directional Gradients

Neural Information Processing Systems, May-31-2025, 10:11:41 GMT

We propose a novel family of decision-aware surrogate losses, called Perturbation Gradient (PG) losses, for the predict-then-optimize framework. These losses directly approximate the downstream decision loss and can be optimized using off-the-shelf gradient-based methods. Importantly, unlike existing surrogate losses, the approximation error of our PG losses vanishes as the number of samples grows. This implies that optimizing our surrogate loss yields a best-in-class policy asymptotically, even in misspecified settings. This is the first such result in misspecified settings and we provide numerical evidence confirming our PG losses substantively outperform existing proposals when the underlying model is misspecified and the noise is not centrally symmetric.
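The abstract does not spell out the form of the PG losses, so the following is only one plausible reading, stated as an assumption: for a linear objective, the decision loss is the directional derivative of the plug-in value V(t) = min_z t·z in the direction of the true cost y, and a surrogate can be obtained by replacing that derivative with a backward finite difference of step h. The decision set, the cost vectors, and every function name below are hypothetical.

```python
# Sketch of a finite-difference ("directional gradient") surrogate loss.
# ASSUMPTION: this backward-difference form is an illustrative reading of
# the abstract, not a definition taken from the source.

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def plug_in_value(cost, decisions):
    """V(cost): the smallest objective value over the feasible decisions."""
    return min(dot(cost, z) for z in decisions)


def best_decision(cost, decisions):
    """A feasible decision that minimises the given cost vector."""
    return min(decisions, key=lambda z: dot(cost, z))


def decision_loss(pred_cost, true_cost, decisions):
    """True cost incurred by acting optimally for the predicted cost."""
    return dot(true_cost, best_decision(pred_cost, decisions))


def pg_backward_loss(pred_cost, true_cost, decisions, h):
    """Backward difference (V(t) - V(t - h*y)) / h of the plug-in value."""
    shifted = [t - h * y for t, y in zip(pred_cost, true_cost)]
    return (plug_in_value(pred_cost, decisions)
            - plug_in_value(shifted, decisions)) / h


# Hypothetical toy problem: pick exactly one of three options.
DECISIONS = [(1, 0, 0), (0, 1, 0), (0, 0, 1)]
pred = [1.0, 2.0, 3.0]
true = [5.0, 1.0, 2.0]

print("decision loss:", decision_loss(pred, true, DECISIONS))
print("surrogate, h=0.01:", pg_backward_loss(pred, true, DECISIONS, h=0.01))
```

Because V is a minimum of linear functions, the surrogate is piecewise linear in the prediction, so it can be handed to ordinary gradient-based training where the raw decision loss has zero or undefined gradients.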

decision-focused learning, directional gradient, surrogate loss

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.46)

Online Decision-Focused Learning

Capitaine, Aymeric, Haddouche, Maxime, Moulines, Eric, Jordan, Michael I., Boursier, Etienne, Durmus, Alain

arXiv.org Machine Learning, May-21-2025

Decision-focused learning (DFL) is an increasingly popular paradigm for training predictive models whose outputs are used in decision-making tasks. Instead of merely optimizing for predictive accuracy, DFL trains models to directly minimize the loss associated with downstream decisions. This end-to-end strategy holds promise for tackling complex combinatorial problems; however, existing studies focus solely on scenarios where a fixed batch of data is available and the objective function does not change over time. We instead investigate DFL in dynamic environments where the objective function and data distribution evolve over time. This setting is challenging because the objective function has zero or undefined gradients -- which prevents the use of standard first-order optimization methods -- and is generally non-convex. To address these difficulties, we (i) regularize the objective to make it differentiable and (ii) make use of the optimism principle, based on a near-optimal oracle along with an appropriate perturbation. This leads to a practical online algorithm for which we establish bounds on the expected dynamic regret, both when the decision space is a simplex and when it is a general bounded convex polytope. Finally, we demonstrate the effectiveness of our algorithm by comparing its performance with a classic prediction-focused approach on a simple knapsack experiment.
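The contrast between predictive accuracy and downstream decision loss can be made concrete with a toy knapsack. The sketch below is not the online algorithm from the abstract; it only evaluates two fixed predictions. The item weights, capacity, values, and the brute-force solver are all hypothetical choices for illustration.

```python
# Sketch: a prediction with lower squared error can still lead to a worse
# knapsack decision. All numbers and helper names are hypothetical.
from itertools import product


def total(values, selection):
    return sum(v * z for v, z in zip(values, selection))


def solve_knapsack(values, weights, capacity):
    """Brute-force 0/1 knapsack: best feasible selection for these values."""
    best, best_value = None, float("-inf")
    for selection in product((0, 1), repeat=len(values)):
        if total(weights, selection) <= capacity:
            value = total(values, selection)
            if value > best_value:
                best, best_value = selection, value
    return best


def decision_regret(pred_values, true_values, weights, capacity):
    """True value lost by optimising predicted instead of true values."""
    chosen = solve_knapsack(pred_values, weights, capacity)
    optimal = solve_knapsack(true_values, weights, capacity)
    return total(true_values, optimal) - total(true_values, chosen)


def mse(pred_values, true_values):
    return sum((p - t) ** 2 for p, t in zip(pred_values, true_values)) / len(true_values)


WEIGHTS = [2, 3, 4]
CAPACITY = 5
TRUE_VALUES = [3.0, 4.0, 6.0]
ACCURATE = [3.0, 4.0, 7.5]   # small error, but it flips the chosen items
BIASED = [5.0, 6.0, 8.0]     # larger error, same decision as the optimum

for name, pred in (("accurate", ACCURATE), ("biased", BIASED)):
    print(name, "mse:", mse(pred, TRUE_VALUES),
          "regret:", decision_regret(pred, TRUE_VALUES, WEIGHTS, CAPACITY))
```

In this toy instance the prediction with the smaller mean squared error incurs positive regret while the uniformly biased one incurs none, which is the gap a decision-focused objective targets.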

artificial intelligence, data mining, machine learning, (18 more...)

arXiv.org Machine Learning

2505.13564

Country:

North America > United States (0.14)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > Middle East > Jordan (0.04)

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.92)

Industry:

Health & Medicine (0.93)
Retail (0.67)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.66)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.34)

Decision-Focused Fine-Tuning of Time Series Foundation Models for Dispatchable Feeder Optimization

Beichter, Maximilian, Friederich, Nils, Pinter, Janik, Werling, Dorina, Phipps, Kaleb, Beichter, Sebastian, Neumann, Oliver, Mikut, Ralf, Hagenmeyer, Veit, Heidrich, Benedikt

arXiv.org Machine Learning, Mar-3-2025

Time series foundation models provide a universal solution for generating forecasts to support optimization problems in energy systems. Those foundation models are typically trained in a prediction-focused manner to maximize forecast quality. In contrast, decision-focused learning directly improves the resulting value of the forecast in downstream optimization rather than merely maximizing forecasting quality. The practical integration of forecast values into forecasting models is challenging, particularly when addressing complex applications with diverse instances, such as buildings. This becomes even more complicated when instances possess specific characteristics that require instance-specific, tailored predictions to increase the forecast value. To tackle this challenge, we use decision-focused fine-tuning within time series foundation models to offer a scalable and efficient solution for decision-focused learning applied to the dispatchable feeder optimization problem. To obtain more robust predictions for scarce building data, we use Moirai as a state-of-the-art foundation model, which offers robust and generalized results with few-shot parameter-efficient fine-tuning. Comparing the decision-focused fine-tuned Moirai with a state-of-the-art classical prediction-focused fine-tuned Moirai, we observe an improvement of 9.45% in average total daily costs.

fine-tuning, foundation model, optimization problem, (12 more...)

arXiv.org Machine Learning

2503.01936

Country:

Europe > Slovenia > Drava > Municipality of Benedikt > Benedikt (0.04)
Europe > Germany > Baden-Württemberg > Karlsruhe Region > Karlsruhe (0.04)
North America > United States > New York > New York County > New York City (0.04)
(2 more...)

Genre: Research Report (1.00)

Industry:

Energy > Renewable > Solar (1.00)
Energy > Power Industry (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Data Science (0.93)

Decision-Focused Learning for Complex System Identification: HVAC Management System Application

Favaro, Pietro, Toubeau, Jean-François, Vallée, François, Dvorkin, Yury

arXiv.org Artificial Intelligence, Jan-24-2025

As opposed to conventional training methods tailored to minimize a given statistical metric or task-agnostic loss (e.g., mean squared error), Decision-Focused Learning (DFL) trains machine learning models for optimal performance in downstream decision-making tools. We argue that DFL can be leveraged to learn the parameters of system dynamics, expressed as constraints of the convex optimization control policy, while the system control signal is being optimized, thus creating an end-to-end learning framework. This is particularly relevant for systems in which behavior changes once the control policy is applied, rendering historical data less applicable. The proposed approach can perform system identification (i.e., determine appropriate parameters for the system analytical model) and control simultaneously to ensure that the model's accuracy is focused on areas most relevant to control. Furthermore, because black-box systems are non-differentiable, we design a loss function that only requires measuring the system response. We propose pre-training on historical data and constraint relaxation to stabilize DFL and deal with potential infeasibilities in learning. We demonstrate the usefulness of the method on a building Heating, Ventilation, and Air Conditioning (HVAC) day-ahead management system for a realistic 15-zone building located in Denver, US. The results show that the conventional RC building model, with the parameters obtained from historical data using supervised learning, underestimates HVAC electrical power consumption. For our case study, the ex-post cost is on average six times higher than the expected one. Meanwhile, the same RC model with parameters obtained via DFL underestimates the ex-post cost by only 3%.

artificial intelligence, machine learning, optimization problem, (13 more...)

arXiv.org Artificial Intelligence

2501.14708

Country:

Europe > Switzerland > Zürich > Zürich (0.14)
North America > United States > Colorado > Denver County > Denver (0.04)
North America > United States > New York > New York County > New York City (0.04)
(10 more...)

Genre: Research Report > New Finding (0.34)

Industry:

Energy > Power Industry (1.00)
Construction & Engineering > HVAC (0.75)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)
